Text File | 1993-11-04 | 8KB | 135 lines

SpellChecker - short Documentation for Users
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

SpellChecker is a program for checking the spelling of words in text files. It works without problems on ASCII text files. Files written by word processors may cause problems because of the control sequences they contain. Another problem can occur with a word processor that saves its files with a constant line length: if you correct errors in such lines, the new line length will differ from the old one, and the word processor may have trouble reading the modified file. Test it first with an old text.

The basis of this program is a list (array) of correctly spelled words. When SpellChecker checks a text, it reads a word from the text and searches for it in the list. If the word is found, it is assumed to be correct; otherwise it may be wrong, and the user gets the chance to change it.

But how do we get this list of correctly spelled words? It is easy. We join some text files (from different authors) into one big text file. (Such text files can be taken from PD docs or from disk magazines.) Then we read the words from this text file and sort them into the list. If we read a word that is already in the list, we increase a counter belonging to that word. In this way we get a list of words whose counters show how often each word occurs. If the list is big enough, we can delete all words with low counters; these words may be wrong or extremely rare. The remaining words with high counters are used often by different authors, so we assume that they are correct.

That was the basic idea of this program; now I will explain how to use it. Start it from the WB by double-clicking its icon, or from the CLI with "[run] SpellChecker". The program needs the ARP library, the device T: and some memory; I think 300 kByte of free RAM should be enough.
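The word-counting idea behind the list can be sketched in a few lines of Python. This is only an illustration, not the program's actual code: the function names, the rule that a word is any run of letters, and the cap of 255 per counter (taken from the MaxCount remark later in this document) are assumptions.

```python
import re

MAX_COUNT = 255  # assumption: counters stop growing at 255


def read_words(text):
    """Split text into words. Here a word is any run of letters (an assumption)."""
    return re.findall(r"[^\W\d_]+", text)


def expand_lexikon(lexikon, text):
    """Count every word of `text` into `lexikon` (a dict mapping word -> counter)."""
    for word in read_words(text):
        lexikon[word] = min(lexikon.get(word, 0) + 1, MAX_COUNT)
    return lexikon


# Starting from an empty dict generates a new list;
# passing an existing dict expands it with further text.
lexikon = expand_lexikon({}, "the cat saw the dog and the bird")
for word in sorted(lexikon):
    print(word, lexikon[word])
```

In this small example "the" ends up with a counter of 3 and every other word with a counter of 1, so the words with the low counters are the candidates for deletion.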
Once the program has started, you see a window with ten boolean gadgets and a string gadget. At this point the list of words (I will sometimes call this list the Lexikon) contains no words. To fill the Lexikon, you can either load it or generate it. Generating is only necessary if you want to create a Lexikon for a new language. With this program I give you two Lexikons, one with English and one with German words. For other languages you have to generate your own.

Generating and expanding a Lexikon are the same procedure. First you need a big text file with the words you want to use for the Lexikon. The safest method is to use ASCII files. (If you use files from your word processor, you should make a small test first; see "ExportLex".) Assume you have a big text file named "BigTextFile" on a disk in drive df1:. Now click on the gadget "ExpandLex". You will see the ARP file requester. I assume you know this requester, so I will not explain it. In the requester click on "Drives", then on "DF1:", then on "BigTextFile" and finally on "OK". The requester now vanishes, all gadgets are ghosted and the mouse pointer sleeps, indicating that the program is working and cannot react to your input.

The program now generates a Lexikon. If the Lexikon was empty, a new one is generated; otherwise the current Lexikon is expanded. This takes some time. For example, I used a 600 kByte text file in ram: for such a run, and it took over 60 minutes to generate a Lexikon containing 11000 words from it. With disk drives it takes much longer, because the source text is read 4 times. Sorry for the long wait, but of course you only have to wait once. When generating is complete, the gadgets return to their normal image, and in the text area you can read something like this:

    Words: 9999 MinCount: 1 MaxCount 178

You now have a Lexikon with 9999 words. You should save it by clicking on "SaveLex".
The ARP file requester appears. You can save the Lexikon under the default name "Lexikon" or change the name in the string gadget. Then click on "OK" to save it.

Well, now you have 9999 words, but some of them may be misspelled, or may be extremely rare words or names. To delete these, click on the gadget "CleanLex". Every click deletes all words from the Lexikon that have the lowest counter. The first click deletes the words with counter=1, the next click the words with counter=2, and so on. Do not worry if you have clicked too often and only a few words are left in the Lexikon: you can load the original Lexikon back from disk by clicking on "LoadLex".

For example, if MinCount=3, then every word in the Lexikon was found three or more times in the text that you used to generate the Lexikon. If MaxCount=56, no word was found more than 56 times in this text. (MaxCount will not grow once it has reached 255.)

Now you have a Lexikon which you can load every time you use SpellChecker. To check the spelling of a text file, load the Lexikon and then click on "CheckText". Now load the text file in the same way as you loaded the Lexikon. After loading, SpellChecker starts to examine the text. It reads a word from the text and searches for it in the Lexikon. If it finds the word, the word is assumed to be correctly spelled and SpellChecker reads the next word. Otherwise the word may be misspelled, and you, the user, have to decide whether it is correct or not. You can correct the word in the string gadget. If you hit "Return" or click on "Ignore", the word is corrected in the text file but not added to the Lexikon. If you click on "AddToLex", the word is corrected and added to the Lexikon.

SpellChecker distinguishes between upper and lower case! It knows that the first letter of a sentence has to be upper case.
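The CleanLex step, the status line and the case-sensitive lookup described above can be sketched as follows. This is a hedged illustration under assumptions, not the program's real implementation: the Lexikon is modelled as a Python dict from word to counter, the function names are invented for this sketch, and the way a word at the start of a sentence is handled is only one plausible reading of the rule.

```python
def clean_lex(lexikon):
    """Remove every word that has the lowest counter; return the removed words."""
    if not lexikon:
        return []
    lowest = min(lexikon.values())
    removed = sorted(w for w, c in lexikon.items() if c == lowest)
    for word in removed:
        del lexikon[word]
    return removed


def status_line(lexikon):
    """Format a summary in the style of the status line shown by the program."""
    if not lexikon:
        return "Words: 0"
    counts = lexikon.values()
    return f"Words: {len(lexikon)} MinCount: {min(counts)} MaxCount {max(counts)}"


def is_known(word, lexikon, sentence_start=False):
    """Case-sensitive lookup of a word in the Lexikon."""
    if word in lexikon:
        return True
    # Assumption: the first word of a sentence is capitalised, so it is also
    # looked up in its mid-sentence form (first letter in lower case).
    if sentence_start and word[:1].isupper():
        return word[:1].lower() + word[1:] in lexikon
    return False


lexikon = {"this": 12, "is": 9, "a": 20, "short": 3, "sentence": 3,
           "sentense": 1, "Amiga": 1}
print(clean_lex(lexikon))      # the words with counter 1 are removed
print(status_line(lexikon))
print(is_known("This", lexikon, sentence_start=True))
print(is_known("This", lexikon))
```

In this example one call to clean_lex removes "Amiga" and "sentense", which shows both effects mentioned in the text: a misspelled word disappears, but so does a correct yet rare name. The lookup accepts "This" only at the start of a sentence, because the Lexikon holds the lower-case form "this".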
If you click on "AddToLex", pay attention to the first letter of the word. Add words to the Lexikon only in the form in which they are written in the MIDDLE OF A SENTENCE!!! For example, in the sentence "This is a short sentence" the first letter of the word "This" is upper case, and this is correct, but you should NOT add the word to the Lexikon in this form, because the word "this" is normally written with a lower-case first letter!

You can correct all words in this way, or cancel the operation by clicking on the "WindowCloseGadget" or on "Quit". If you click on "Quit" or the "WindowCloseGadget" before all words are corrected, the text file is left unchanged.

If you click on "DelWords", you can delete single words from the Lexikon. For example, if you suspect that there is a misspelled word in the Lexikon, click on "DelWords", type the word into the string gadget and press "RETURN" or click on "DeleteIt" to try to delete it. If the word exists in the Lexikon, SpellChecker deletes it; otherwise it displays the message "Word not found". To leave this mode, click on "Quit" or the "WindowCloseGadget".

The last gadget is the "ExportGadget". With this gadget you can export all words of the Lexikon to a text file. To export them, click on "ExportLex" and then enter a name for the text file. After exporting, you can use an editor to look at the file and delete words you do not like. Then you can import the file again with the gadget "ExpandLex".

Hint: If you use CleanLex, all deleted words are exported to the file "T:CleanLex.txt". You can use this file in the same way as the export file: for example, delete all wrong words with an editor and then import the remaining words again with "ExpandLex".

Stefan Salewski, 16 March 1990